The use of improved covariance matrix estimators as an alternative to the sample estimator
is considered an important approach for enhancing portfolio optimization. Here we empirically
compare the performance of 9 improved covariance estimation procedures by using daily returns of
90 highly capitalized US stocks for the period 1997-2007. We find that the usefulness of covariance
matrix estimators strongly depends on the ratio between estimation period T and number of stocks
N, on the presence or absence of short selling, and on the performance metric considered. When
short selling is allowed, several estimation methods achieve a realized risk that is significantly
smaller than the one obtained with the sample covariance method. This is particularly true when
T/N is close to one. Moreover many estimators reduce the fraction of negative portfolio weights,
while little improvement is achieved in the degree of diversification. On the contrary, when short
selling is not allowed and T > N, the considered methods are unable to outperform the sample
covariance in terms of realized risk but can give much more diversified portfolios than the one
obtained with the sample covariance. When T < N the use of the sample covariance matrix and
of the pseudoinverse gives portfolios with very poor performance.

I. INTRODUCTION



Portfolio optimization [1-3] is one of the main topics in quantitative finance. Markowitz's
solution to the portfolio optimization problem, the mean-variance efficient portfolio, relies
upon a series of assumptions and is constructed by using first and second sample moments
of financial asset returns. Although analytical and elegant, Markowitz's solution to the
portfolio optimization problem turns out to be highly sensitive to estimation errors of sample
moments. For this reason many moment estimators have been proposed to improve the
performance of the portfolio optimization. Furthermore the typical outcome of the Markowitz
optimization procedure, especially for large portfolios, is characterized by large negative
weights for a certain number of assets of the portfolio [4-6]. Negative portfolio weights
require taking a short position (selling an asset without owning it), which is sometimes
difficult to implement in practice or forbidden to some classes of investors. For this reason
it is quite common to constrain portfolio weights in the optimization procedure.

In the present study, we focus on the role played in the portfolio selection by estimation
errors of the second moments of asset returns, both when taking short selling positions
is allowed and when it is forbidden. We can ignore estimation errors of asset returns by
restricting our attention to the global minimum variance portfolio, where asset returns are
not involved [7]. Note that this choice is not a limiting one. In fact, the global
minimum variance portfolio is typically characterized by an out-of-sample Sharpe ratio (the
ratio between the portfolio return and its standard deviation, a key portfolio performance
measure) which is as good as that of other efficient portfolios [6], [8]. Indeed, there is a
consensus on the view that benefits of diversification can be achieved from risk reduction
rather than from return maximization [8]. Furthermore, the determination of expected
returns is the role of the economist and of the portfolio manager, who are asked to generate
or select valuable private information, while estimation of the covariance matrix is the task
of the quantitative analyst [9].

The simplest estimator of the covariance matrix of asset returns is the sample covariance
estimator, which has $N(N+1)/2$ ($\approx N^2/2$ when N is large) distinct elements. For
an estimation time horizon of length T, the number of available data is $N \times T$. A very
common circumstance in portfolio selection is that the number of assets N is of the same
order of magnitude as the estimation time horizon T, for example because non stationarity
problems arise for large T, or because the portfolio is very large. In this case, the total
number of parameters to be estimated is of the same order of magnitude as the total size
of available data. This unavoidable lack of data records generates large estimation errors in
the sample covariance matrix, and thus covariance filtering methods are especially useful
in order to reduce the estimation error. Here we discuss and compare the performance of
portfolios obtained by using several estimators of the covariance matrix. We perform the
comparison of portfolio selection methods at different time horizons T, and we consider the
portfolio optimization problem both with and without including short selling constraints.
Specifically, we apply portfolio optimization methods to 90 highly capitalized stocks traded
at the New York Stock Exchange (NYSE) during the time period from January 1997 to
December 2007. We find the global minimum variance portfolio both with and without short
selling constraints at different time horizons. The investment and estimation horizons are
chosen to be identical, and range from one month (approximately T = 20 trading days) to
two years (approximately T = 480 trading days). We compare the performance of 10
covariance matrix estimators, namely the sample covariance estimator used in the Markowitz
optimization, three estimators based on the spectral properties of the covariance matrix
[10]-[14], three estimators based on hierarchical clustering [15]-[19], and three estimators
based on shrinking procedures [6l O EOj [2T].

We find that the effectiveness of the last 9 covariance estimators with respect to the
sample estimator in portfolio optimization depends on the presence or absence of short
selling, on the performance metric considered, and on the ratio T/N. Specifically, when
short selling is allowed, several covariance estimators are able to give portfolios significantly
less risky than the Markowitz portfolio. This is particularly true when T/N is close to one,
in agreement with previous observations that Markowitz portfolio optimization can be quite
problematic and ineffective in the $T/N \approx 1$ regime [22]-[25]. Moreover, for a wide range of
T/N, we verify that portfolios obtained by using the proposed estimation procedures have
a lower proportion of negative over positive weights (amount of short selling) [6] than the
Markowitz optimal portfolio, especially when $T/N \approx 1$. However the degree of effective
diversification of the portfolio is similar for different methods (including Markowitz).

The situation is significantly different when short selling is forbidden. When T > N
the realized risk of the Markowitz portfolio becomes comparable to that of the other portfolios.
In this respect the tested estimators are not able to give portfolios significantly less risky
than the Markowitz one, and all the tested estimators have very similar risk. However
the portfolios obtained with these estimators are significantly more diversified than the
Markowitz portfolio.

When T < N the inverse of the sample covariance matrix does not exist because it has 
zero eigenvalues. It has been proposed to use the pseudoinverse to extend the Markowitz 
optimization to the case T < N. We find that portfolios obtained with the pseudoinverse 
are more risky and less diversified than the other portfolios. 

By comparing portfolios with and without short selling we also verify and generalize
the observation that including constraints (such as the no short selling constraint) in the
portfolio optimization procedure is similar to performing an unconstrained optimization with
a filtered covariance matrix (see Ref. [6] for shrinkage estimators and Ref. [26] for some
covariance estimators based on spectral properties).

The paper is organized as follows. In Section II we discuss basic aspects of the Markowitz 
portfolio optimization procedure and set the notation. In Section III we describe the inves- 
tigated covariance matrix estimators. Section IV presents the data set, the methodologies 
used to compare the different portfolios, and the empirical results. Section V concludes. 

II. MARKOWITZ PORTFOLIO OPTIMIZATION 

In this section we briefly discuss some basic aspects of portfolio optimization in the Markowitz
framework. This is also useful to set the notation and state the assumptions made and the
methods used.

Given N stocks, at time $t_0$ an investor selects his/her portfolio of stocks by choosing
a fraction of wealth $w_i$ to invest in stock i, with i = 1, ..., N, in order to have maximum
profit and minimum risk from his/her investment at a fixed time $t_0 + T$ in the future. The
N-dimensional column vector of the weights $\mathbf{w}$ is normalized as
$\mathbf{w}^T \mathbf{1}_N = 1$, where $\mathbf{1}_N$ is the N-dimensional column vector of
ones. The average return and the variance of the portfolio are

$$r_p = \mathbf{w}^T \mathbf{m} \quad \text{and} \quad \sigma_p^2 = \mathbf{w}^T \Sigma\, \mathbf{w}, \qquad (1)$$

respectively, where $\mathbf{m}$ and $\Sigma$ are the N-dimensional column vector of mean
returns and the $N \times N$ covariance matrix of the stocks, respectively. Markowitz's
optimization problem consists in finding the vector $\mathbf{w}$ which minimizes $\sigma_p$ for
a given value of $r_p$. The choice of using the standard deviation as a measure of risk is based
on the assumption that returns follow a Gaussian distribution. If one does not set any constraint
on the value of the weights, allowing them to be either positive or negative, Markowitz's
solution to the optimization problem [2] is

$$\mathbf{w}^* = \lambda\, \Sigma^{-1} \mathbf{1}_N + \gamma\, \Sigma^{-1} \mathbf{m} \qquad (2)$$

where

$$\lambda = \frac{C - r_p B}{\Delta}, \qquad \gamma = \frac{r_p A - B}{\Delta},$$

$$A = \mathbf{1}_N^T \Sigma^{-1} \mathbf{1}_N, \qquad B = \mathbf{1}_N^T \Sigma^{-1} \mathbf{m},$$

$$C = \mathbf{m}^T \Sigma^{-1} \mathbf{m}, \qquad \Delta = AC - B^2.$$

The inverse of the parameter $\gamma$ is usually referred to as risk aversion.

When $\gamma = 0$ (infinite risk aversion), the optimal portfolio is the global minimum
variance portfolio and it does not depend on expected returns. Since in this paper we aim
to investigate the role of estimation risk of the covariance matrix, we focus on the global
minimum variance portfolio, as done in Refs. [6l0[9j, which obviously does not depend on
the estimation error of mean returns. Markowitz optimization typically gives both positive
and negative portfolio weights and, especially for large portfolios, it usually gives large
negative weights for a certain number of assets [4]-[6]. A negative weight corresponds to a short
selling position (selling an asset without owning it), which is sometimes difficult to implement
in practice or forbidden. For this reason it is common practice to impose constraints
on the portfolio weights in the optimization procedure. When one adds constraints on the
range of variation of the $w_i$, the optimization problem cannot be solved analytically, and
quadratic programming must be used. Quadratic programming algorithms are implemented
in most numerical programs, such as Matlab or R. In the following we will consider the
portfolio optimization problem both with and without the no short selling constraint
$w_i \geq 0 \;\; \forall i = 1, \dots, N$.
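
As a worked illustration of the $\gamma = 0$ case: Eq. (2) together with the normalization of the weights reduces to the closed form $\mathbf{w} = \Sigma^{-1}\mathbf{1}_N / (\mathbf{1}_N^T \Sigma^{-1} \mathbf{1}_N)$. The following is a minimal NumPy sketch of that formula only. The function name `gmv_weights` and the toy covariance matrix are illustrative assumptions and are not taken from the study; the constrained (no short selling) problem would require a quadratic programming solver, which is not shown here.

```python
import numpy as np

def gmv_weights(cov):
    """Unconstrained global minimum variance weights: inv(cov) 1 / (1' inv(cov) 1)."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve cov @ x = 1 rather than forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    return x / x.sum()

# Toy 3-asset covariance matrix (illustrative numbers, not data from the paper).
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
w = gmv_weights(cov)
variance = w @ cov @ w   # portfolio variance, second expression in Eq. (1)
```

At the optimum all components of `cov @ w` are equal, which is the first-order condition of the minimization under the single normalization constraint.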

III. COVARIANCE MATRIX ESTIMATORS 

One of the main problems of portfolio optimization is the estimation of the mean returns
vector $\mathbf{m}$ and covariance matrix $\Sigma$. For the global minimum variance portfolio
the investor needs only to estimate $\Sigma$. In what follows we estimate the covariance matrix
by using past return data. Specifically, at time $t_0$ we estimate the sample covariance matrix
of daily returns in the T trading days preceding $t_0$. We then apply the different estimators and
calculate the optimal portfolio. This portfolio is held until time $t_0 + T$, when we evaluate
its performance. Note that our estimation and investment time horizons are chosen to be
the same. We consider three classes of estimators: i) spectral estimators, ii) hierarchical
clustering estimators, and iii) shrinkage estimators.

A. Markowitz direct optimization 

Let us first point out some aspects associated with the Markowitz direct optimization. In
this case, the estimator of the covariance matrix at time $t_0$ is the sample covariance matrix
estimated on the preceding T days. The input to the global minimum variance optimization
problem is the inverse of the sample covariance matrix. When T < N the inverse of the
sample covariance matrix does not exist because of the presence of null eigenvalues. As
suggested in the literature (for example in Ref. [9]), in the optimization problem we use
the pseudoinverse, also called generalized inverse [27], of the covariance matrix. Replacing
the inverse of the covariance matrix with the pseudoinverse in the optimization problem
allows one to get a unique combination of portfolio weights. It should be noted that, when
T < N, the optimization problem remains undetermined and the pseudoinverse solution is
just a natural choice among the infinitely many solutions to the portfolio optimization
problem.

In the same regime T < N, this problem does not arise for the other covariance estimators,
because they typically give positive definite covariance matrices for any value of T/N,
including T/N < 1.
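
The singularity of the sample covariance matrix for T < N, and the use of the pseudoinverse in its place, can be illustrated with a short NumPy sketch. The data below are synthetic Gaussian returns generated only for illustration, and the sizes N = 10 and T = 6 are arbitrary assumptions chosen so that T < N.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 10, 6                          # more assets than observations
returns = rng.normal(size=(T, N))     # synthetic returns, for illustration only

sample_cov = np.cov(returns, rowvar=False)   # N x N sample covariance
rank = np.linalg.matrix_rank(sample_cov)     # at most T - 1 after demeaning

# The ordinary inverse does not exist; use the Moore-Penrose pseudoinverse.
pinv = np.linalg.pinv(sample_cov)
ones = np.ones(N)
w = pinv @ ones / (ones @ pinv @ ones)       # pseudoinverse "minimum variance" weights
predicted_variance = w @ sample_cov @ w
```

The rank deficiency is what leaves the optimization problem undetermined; the pseudoinverse simply selects one particular set of weights among the admissible ones.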

B. Spectral estimators 

The first class of methods includes three different estimators of the covariance matrix,
which make use of the spectral properties of the correlation matrix. The fundamental idea
behind these methods is that the eigenvalues of the sample covariance matrix carry different
economic information depending on their value.

The first method we consider is the single index model (see for instance Ref. PEUEH]). In
this model stock returns $r_i(t)$ are described by the set of linear equations
$r_i(t) = \beta_i f(t) + \epsilon_i(t)$, i = 1, ..., N, where returns are therefore given by
the linear combination of a single random variable, the index $f(t)$, and of an idiosyncratic
stochastic term $\epsilon_i(t)$. The parameters $\beta_i$ can be estimated by linear regression
of stock return time series on the index return. The covariance matrix associated with the
model is $\Sigma^{(SI)} = \sigma_{00}\, \beta \beta^T + D$, where $\sigma_{00}$ is the variance
of the index, $\beta$ is the vector of parameters $\beta_i$, and $D$ is the diagonal matrix of
variances of $\epsilon_i$. We indicate this method hereafter as SI. It can be shown that this
method gives an estimated covariance matrix very similar to the one obtained with the method
RMT-0 (see below) when only the largest eigenvalue of the sample covariance is assumed to carry
reliable economic information.
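
A minimal NumPy sketch of the single index covariance $\sigma_{00}\beta\beta^T + D$ is given below. It is an illustration under stated assumptions rather than the authors' code: the function name `single_index_cov` is hypothetical, returns and index are demeaned before the regression (equivalent to including an intercept), and variances use the denominator T - 1.

```python
import numpy as np

def single_index_cov(returns, index_returns):
    """Single index covariance: var(index) * beta beta' + diag(residual variances).

    returns: (T, N) array of stock returns; index_returns: (T,) array.
    """
    r = returns - returns.mean(axis=0)           # demeaned stock returns
    f = index_returns - index_returns.mean()     # demeaned index returns
    n_obs = len(f)
    var_index = f @ f / (n_obs - 1)              # sigma_00
    beta = (r.T @ f) / (f @ f)                   # OLS slope of each stock on the index
    resid = r - np.outer(f, beta)                # idiosyncratic terms
    resid_var = (resid ** 2).sum(axis=0) / (n_obs - 1)
    return var_index * np.outer(beta, beta) + np.diag(resid_var)
```

With these conventions the diagonal of the result coincides with the sample variances of the stocks, so only the off-diagonal covariances are filtered.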

The other two spectral methods make use of the Random Matrix Theory (RMT) [TOHIS].
Specifically, if the N variables of the system are i.i.d. with finite variance $\sigma^2$, then
in the limit $T, N \to \infty$, with a fixed ratio T/N, the eigenvalues of the sample covariance
matrix are bounded from above by the value

$$\lambda_{max} = \sigma^2 \left(1 + \frac{N}{T} + 2\sqrt{\frac{N}{T}}\right), \qquad (3)$$

where $\sigma^2 = 1$ for correlation matrices. In most practical cases, one finds that the largest
eigenvalue $\lambda_1$ of the sample correlation matrix of stocks is definitely inconsistent with
RMT, i.e. $\lambda_1 \gg \lambda_{max}$. In fact the eigenvector associated with the largest
eigenvalue is typically identified with the market mode. To cope with this evidence, Laloux et
al. [11] propose to modify the null hypothesis of RMT so that system correlations can be
described in terms of a one factor model instead of a pure random model. Under such a less
restrictive null hypothesis the value of $\lambda_{max}$ is still given by Eq. (3), but now
$\sigma^2 = 1 - \lambda_1/N$. Here we consider two different procedures that apply
RMT to the covariance estimation problem.
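
The bound of Eq. (3) and its one factor variant are simple enough to evaluate directly. The sketch below is an illustration only; the function name `rmt_upper_bound` and its optional argument are hypothetical names introduced here.

```python
import numpy as np

def rmt_upper_bound(n_assets, n_obs, largest_eigenvalue=None):
    """Upper edge of the random-matrix spectrum of a correlation matrix, Eq. (3).

    If largest_eigenvalue is given, the one factor null hypothesis is used,
    i.e. sigma^2 = 1 - lambda_1 / N; otherwise sigma^2 = 1.
    """
    ratio = n_assets / n_obs
    if largest_eigenvalue is None:
        sigma2 = 1.0
    else:
        sigma2 = 1.0 - largest_eigenvalue / n_assets
    return sigma2 * (1.0 + ratio + 2.0 * np.sqrt(ratio))

# N = 90 stocks and T = 250 days: 1 + 0.36 + 2 * 0.6 = 2.56
bound_yearly = rmt_upper_bound(90, 250)
```

For N = 90 and T = 90 the pure random bound equals 4, so any eigenvalue of the sample correlation matrix well above that value cannot be attributed to noise under this null hypothesis.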

The first procedure has been proposed by Rosenow et al. in Ref. [13] and works as
follows. One diagonalizes the sample correlation matrix and replaces all the eigenvalues
smaller than $\lambda_{max}$ with 0. One then transforms back the modified diagonal matrix in the
standard basis, obtaining the matrix $H^{(\mathrm{RMT-0})}$. The filtered correlation matrix
$C^{(\mathrm{RMT-0})}$ is obtained by simply forcing to 1 the diagonal elements of
$H^{(\mathrm{RMT-0})}$. Finally the filtered covariance matrix $\Sigma^{(\mathrm{RMT-0})}$ is the
matrix of elements
$\sigma_{ij}^{(\mathrm{RMT-0})} = c_{ij}^{(\mathrm{RMT-0})} \sqrt{\sigma_{ii}\sigma_{jj}}$, where
$c_{ij}^{(\mathrm{RMT-0})}$ are the entries of $C^{(\mathrm{RMT-0})}$ and $\sigma_{ii}$ and
$\sigma_{jj}$ are the sample variances of variables i and j respectively. In the following we
will refer to this method as the RMT-0 method.

The second way to reduce the impact of eigenvalues smaller than $\lambda_{max}$ onto the estimate
of portfolio weights has been proposed by Potters et al. in Ref. [14]. In this case one
diagonalizes the sample correlation matrix and replaces all the eigenvalues smaller than
$\lambda_{max}$ with their average value. Then one transforms back the modified diagonal matrix in
the original basis, obtaining the matrix $H^{(\mathrm{RMT-M})}$ of elements
$h_{ij}^{(\mathrm{RMT-M})}$. Note that replacing the eigenvalues smaller than $\lambda_{max}$ with
their average value preserves the trace of the matrix. Finally, the filtered correlation matrix
$C^{(\mathrm{RMT-M})}$ is the matrix of elements
$c_{ij}^{(\mathrm{RMT-M})} = h_{ij}^{(\mathrm{RMT-M})} / \sqrt{h_{ii}^{(\mathrm{RMT-M})} h_{jj}^{(\mathrm{RMT-M})}}$.
The covariance matrix $\Sigma^{(\mathrm{RMT-M})}$ used in the portfolio optimization is the matrix
of elements
$\sigma_{ij}^{(\mathrm{RMT-M})} = c_{ij}^{(\mathrm{RMT-M})} \sqrt{\sigma_{ii}\sigma_{jj}}$, where
$\sigma_{ii}$ and $\sigma_{jj}$ are again the sample variances of variables i and j respectively.
We will refer to this method as the RMT-M method.
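
The two filtering procedures differ only in what replaces the eigenvalues below $\lambda_{max}$ and in how the unit diagonal is restored, so they can be sketched in a single function. This is a minimal NumPy illustration, not the authors' implementation: the function name `rmt_filtered_covariance` and the `method` argument are hypothetical, and the one factor threshold $\sigma^2 = 1 - \lambda_1/N$ is assumed.

```python
import numpy as np

def rmt_filtered_covariance(returns, method="RMT-0"):
    """Filtered covariance from a (T, N) array of returns; method is "RMT-0" or "RMT-M"."""
    if method not in ("RMT-0", "RMT-M"):
        raise ValueError("method must be 'RMT-0' or 'RMT-M'")
    n_obs, n_assets = returns.shape
    std = returns.std(axis=0, ddof=1)
    corr = np.corrcoef(returns, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)          # eigenvalues in ascending order

    # One factor null hypothesis: sigma^2 = 1 - lambda_1 / N, then Eq. (3).
    ratio = n_assets / n_obs
    sigma2 = 1.0 - eigval[-1] / n_assets
    lam_max = sigma2 * (1.0 + ratio + 2.0 * np.sqrt(ratio))

    noisy = eigval < lam_max
    filtered = eigval.copy()
    if method == "RMT-0":
        filtered[noisy] = 0.0                      # discard the noisy eigenvalues
    elif noisy.any():
        filtered[noisy] = eigval[noisy].mean()     # replace them with their average

    # Back to the original basis: H = V diag(filtered) V'.
    h = (eigvec * filtered) @ eigvec.T
    if method == "RMT-0":
        c = h.copy()
        np.fill_diagonal(c, 1.0)                   # force the diagonal to 1
    else:
        d = np.sqrt(np.diag(h))
        c = h / np.outer(d, d)                     # rescale to unit diagonal
    return c * np.outer(std, std)                  # restore sample variances
```

In both cases the diagonal of the output equals the sample variances, since only the correlation structure is filtered.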



C. Agglomerative hierarchical clustering estimators 

The second class of methods comprises three different estimators of the covariance matrix
based on agglomerative hierarchical clustering [15]. Agglomerative hierarchical clustering
methods are clustering procedures based on pair grouping where elements are iteratively
merged together in clusters of increasing size according to their degree of similarity.
Hierarchical clustering procedures therefore depend on the chosen similarity measure between
elements of the system. In the present study we consider the correlation as a measure of
similarity between two elements in the system. Hierarchical clustering algorithms work as
follows. Given a data set of N time series, at the beginning each element defines a
cluster. The similarity between two clusters is defined as the correlation coefficient between
the corresponding two time series. Then the two clusters with the largest correlation are
merged together in a single cluster. At the second iteration one has to tackle the subtler
problem of defining a similarity between clusters. Different similarities between clusters can
be defined, each one characterizing a specific hierarchical clustering procedure. Once the
similarity between two clusters is consistently defined, then the two clusters with the largest
similarity are merged together, and the procedure is iterated until, after N - 1 iterations,
all the elements are grouped together in one cluster, corresponding to the whole data set.

We consider here three hierarchical clustering procedures that differ in the definition of
similarity between clusters. In the unweighted pair group method with arithmetic mean
(UPGMA), if a new cluster L is formed from clusters A and B, then the similarity between
cluster L and any other cluster F is given by

$$\rho_{L,F} = \frac{N_A\, \rho_{A,F} + N_B\, \rho_{B,F}}{N_A + N_B}, \qquad (4)$$

where $N_A$ and $N_B$ are the number of elements in cluster A and B, respectively. With this
rule the similarity between cluster L and cluster F is given by the arithmetic mean of the
set $\{\rho_{ij}, \forall i \in L \text{ and } \forall j \in F\}$. In the weighted pair group
method with arithmetic mean (WPGMA) the average is weighted in such a way as to remove the
effect of the possibly different sizes of A and B:

$$\rho_{L,F} = \frac{\rho_{A,F} + \rho_{B,F}}{2}. \qquad (5)$$

Finally, in the Hausdorff linkage cluster analysis [19], the similarity between cluster L and
cluster F is obtained in terms of the Hausdorff distance between the two clusters

$$\rho_{L,F} = \min\left\{ \min_{i \in L} \max_{j \in F} \rho_{ij},\; \max_{i \in L} \min_{j \in F} \rho_{ij} \right\}. \qquad (6)$$

The output of any hierarchical clustering procedure is a dendrogram where each node $\alpha_k$ is
associated with the similarity $\rho_{\alpha_k}$ between the two clusters of elements merging
together in the node $\alpha_k$. One can therefore construct a filtered similarity matrix
associated with a specific dendrogram as follows. Each entry (i, j) of the filtered matrix is set
to $\rho_{\alpha_k}$, where $\alpha_k$ is the node of the dendrogram corresponding to the smallest
cluster in which the elements i and j merge together. The filtered matrix is positive definite
provided that its entries are non negative numbers [17] and that the dendrogram does not show
reversals [15]. The first condition is typically observed in the financial case; the second is
always satisfied by the UPGMA and the WPGMA, while it could be violated in the Hausdorff method.
When reversals are present in the dendrogram associated with the Hausdorff method, we remove such
reversals by using the minimum spanning tree associated with the hierarchical clustering
procedure [29]. Since our procedure generates positive definite matrices, they can be
interpreted as correlation matrices. Once the filtered matrix is constructed, we obtain an
estimate of the covariance matrix by multiplying each entry (i, j) by the sample standard
deviations of variables i and j. Hierarchical clustering procedures have been shown to be
effective in extracting financial information from the correlation matrix of stock returns since
Ref. [16]. Finally, note that hierarchical clustering methods have already been considered in
portfolio optimization in Ref. [18].
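
The construction of the filtered matrix from the merging sequence can be sketched as follows for the UPGMA and WPGMA rules of Eqs. (4) and (5). This is a minimal NumPy illustration under stated assumptions: the function name `filtered_correlation` is hypothetical, ties are broken by array order, and the Hausdorff linkage and the removal of reversals are not covered.

```python
import numpy as np

def filtered_correlation(corr, method="upgma"):
    """Filtered correlation matrix from average-linkage clustering on similarities."""
    if method not in ("upgma", "wpgma"):
        raise ValueError("method must be 'upgma' or 'wpgma'")
    n = corr.shape[0]
    sim = np.array(corr, dtype=float)
    np.fill_diagonal(sim, -np.inf)            # a cluster is never merged with itself
    members = {i: [i] for i in range(n)}      # active cluster id -> list of elements
    filtered = np.eye(n)
    while len(members) > 1:
        ids = sorted(members)
        sub = sim[np.ix_(ids, ids)]
        a_pos, b_pos = np.unravel_index(np.argmax(sub), sub.shape)
        a, b = ids[a_pos], ids[b_pos]         # the two most similar clusters
        rho = sim[a, b]
        # Elements of a and b first meet at this node of the dendrogram.
        for i in members[a]:
            for j in members[b]:
                filtered[i, j] = filtered[j, i] = rho
        n_a, n_b = len(members[a]), len(members[b])
        for other in ids:
            if other in (a, b):
                continue
            if method == "upgma":             # Eq. (4)
                new = (n_a * sim[a, other] + n_b * sim[b, other]) / (n_a + n_b)
            else:                             # Eq. (5)
                new = 0.5 * (sim[a, other] + sim[b, other])
            sim[a, other] = sim[other, a] = new
        members[a] = members[a] + members[b]  # cluster a now represents the merger
        del members[b]
    return filtered

corr = np.array([[1.0, 0.8, 0.3, 0.2],
                 [0.8, 1.0, 0.4, 0.1],
                 [0.3, 0.4, 1.0, 0.6],
                 [0.2, 0.1, 0.6, 1.0]])
filtered = filtered_correlation(corr, method="upgma")
```

In this toy example elements 0 and 1 merge at 0.8, elements 2 and 3 at 0.6, and the two clusters merge at the mean of the four cross correlations, 0.25. A covariance estimate would then follow as `filtered * np.outer(std, std)` for a vector `std` of sample standard deviations.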



D. Shrinkage estimators 

The last class of estimators comprises linear shrinkage methods. Linear shrinkage is a
well-established technique in high-dimensional inference problems, when the size of data
is small compared to the number of unknown parameters in the model. In such cases, the
sample covariance matrix is the best estimator in terms of actual fit to the data but it is
suboptimal because the number of parameters to be fitted is larger than the amount of data
available [30]. The idea is to construct a more robust estimate $Q$ of the covariance matrix by
shrinking the sample covariance matrix $S$ to a target matrix $\mathbf{T}$, which is typically
positive definite and has a lower variance. The shrinking is obtained by computing

$$Q = \alpha \mathbf{T} + (1 - \alpha) S, \qquad (7)$$

where $\alpha$ is a parameter named shrinkage intensity. We consider three different shrinkage
estimates of the covariance matrix, each one characterized by a specific target matrix.

The shrinkage to single index uses the target matrix
$\mathbf{T} = \Sigma^{(SI)} = \sigma_{00}\, \beta \beta^T + D$, i.e., the
single index covariance matrix previously discussed. This target was first proposed in the
context of portfolio optimization by Ledoit et al. [9]. The second method is called shrinkage
to common covariance. The target $\mathbf{T}$ is a matrix where the diagonal elements are all
equal to the average of sample variances, while non diagonal elements are equal to the average
of sample covariances. In the shrinkage to common covariance the heterogeneity of stock
variances and of stock covariances is therefore minimized. The method has been proposed
for the analysis of bioinformatic data in Ref. [31] and, to the best of our knowledge, it
has never been used in the context of financial data analysis. The third method, termed
shrinkage to constant correlation, has a more structured target and was used in Ref. [21]. The
estimator is obtained by first shrinking the correlation matrix to a target named constant
correlation, and then by multiplying the shrunk correlation matrix by the sample standard
deviations. The constant correlation target is a matrix with diagonal elements equal to
one, and off-diagonal elements equal to the average sample correlation between the elements
of the system. As $\alpha$ (the shrinkage intensity) we use the unbiased estimate analytically
calculated in [31].

In conclusion we consider 10 covariance matrix estimators that we label: Markowitz, SI,
RMT-0, RMT-M, UPGMA, WPGMA, Hausdorff, shrinkage to SI, shrinkage to common
covariance, and shrinkage to constant correlation.
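
The linear shrinkage of Eq. (7) with the common covariance target can be sketched in a few lines of NumPy. This is an illustration only: the function names are hypothetical, and the shrinkage intensity `alpha` is passed in by hand as an assumption, since the analytical estimate of Ref. [31] used in the study is not reproduced here.

```python
import numpy as np

def common_covariance_target(sample_cov):
    """Target with equal diagonal (mean variance) and equal off-diagonal (mean covariance)."""
    n = sample_cov.shape[0]
    avg_var = np.trace(sample_cov) / n
    off_diagonal = ~np.eye(n, dtype=bool)
    avg_cov = sample_cov[off_diagonal].mean()
    target = np.full((n, n), avg_cov)
    np.fill_diagonal(target, avg_var)
    return target

def shrink(sample_cov, target, alpha):
    """Linear shrinkage, Eq. (7): alpha * target + (1 - alpha) * sample_cov."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * target + (1.0 - alpha) * sample_cov
```

Because the target has the same trace as the sample covariance matrix, the total variance is unchanged for every value of `alpha`; only its distribution across stocks and the off-diagonal structure are smoothed.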

IV. OPTIMIZATION PROCESS: EMPIRICAL RESULTS 

In this Section we present repeated portfolio optimizations performed by using the co- 
variance estimators discussed in the previous Section. A set of highly liquid stocks traded 
at the NYSE is used. 

A. Data 

Our dataset consists of the daily returns of N = 90 highly capitalized stocks traded
at NYSE and included in the NYSE US 100 Index. For these stocks the closing prices are
available in the eleven-year period from 1 January 1997 to 31 December 2007 [33]. The ticker
symbols of the investigated stocks are AA, ABT, AIG, ALL, APA, AXP, BA, BAC, BAX,
BEN, BK, BMY, BNI, BRK-B, BUD, C, CAT, CCL, CL, COP, CVS, CVX, D, DD, DE,
DIS, DNA, DOW, DVN, EMC, EMR, EXC, FCX, FDX, FNM, GD, GE, GLW, HAL, HD,
HIG, HON, HPQ, IBM, ITW, JNJ, JPM, KMB, KO, LEH, LLY, LMT, LOW, MCD, MDT,
MER, MMM, MO, MOT, MRK, MRO, MS, NWS-A, OXY, PCU, PEP, PFE, PG, RIG,
S, SGP, SLB, SO, T, TGT, TRV, TWX, TXN, UNH, UNP, USB, UTX, VLO, VZ, WAG,
WB, WFC, WMT, WYE, XOM. As reference index in the SI model and in the shrinkage to
single index we use the Standard & Poor's 500 index, which is a widely used broadly-based
market index.

At time $t_0$ the portfolio is selected by choosing the optimal weights that solve the global
minimum variance problem with or without short selling constraints. The input to the
optimization problem is the covariance matrix estimator $\Sigma^{(f)}$ calculated using the T
days preceding $t_0$ and obtained with one of the methods, i.e. $f \in$ {Markowitz (M), SI, RMT-0,
RMT-M, UPGMA, WPGMA, Hausdorff, shrinkage to SI, shrinkage to common covariance,
shrinkage to constant correlation}. We call $\Sigma^{(f)}$ the estimated covariance matrix. For
instance, in this notation, $\Sigma^{(M)}$ is the sample covariance matrix, i.e. the one used in
Markowitz portfolio optimization. The output of the optimization problem is

$$\mathbf{w}^{(f)} = \arg\min_{\mathbf{w}} \mathbf{w}^T \Sigma^{(f)} \mathbf{w}, \qquad (8)$$

with the appropriate constraints. The ex post covariance matrix $\hat{\Sigma}$ is defined as the
sample covariance matrix calculated using the T days following $t_0$. The predicted portfolio
risk is

$$s_p^{(f)} = \sqrt{\mathbf{w}^{(f)T}\, \Sigma^{(f)}\, \mathbf{w}^{(f)}}, \qquad (9)$$

and the realized portfolio risk is

$$\hat{s}_p^{(f)} = \sqrt{\mathbf{w}^{(f)T}\, \hat{\Sigma}\, \mathbf{w}^{(f)}}. \qquad (10)$$

Thus both the predicted risk $s_p^{(f)}$ and the realized risk $\hat{s}_p^{(f)}$ are estimated
by using a time window of length T. The time window T is varied over a wide range. In our
empirical study, we use seven different time windows T of 1, 2, 3, 6, 9, 12, and 24 months. In
other words, we select the portfolio monthly ($T \approx 20$), bimonthly ($T \approx 40$),
quarterly ($T \approx 60$), every six months ($T \approx 125$), every nine months
($T \approx 187$), yearly ($T \approx 250$), and every two years ($T \approx 500$). Since the
total number of trading days is 2761, we consider 131, 65, 43, 21, 13, 10, and 8 portfolio
optimizations for the time horizon T equal to 1, 2, 3, 6, 9, 12, and 24 months, respectively
(for the 24 months case, in order to improve the statistics, we repeated the optimization
process starting from 1 January 1998). In order to compare risk levels at different time
horizons, we report annualized risks in all figures and tables.
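
One step of this procedure, estimating on the T days before $t_0$ and evaluating on the T days after it, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: the function names are hypothetical, only the unconstrained global minimum variance portfolio is covered, and daily risk is annualized by a factor $\sqrt{250}$, which is an assumption since the annualization convention is not stated in the text.

```python
import numpy as np

TRADING_DAYS_PER_YEAR = 250   # assumption: used only to annualize daily risk

def sample_covariance(window):
    """Sample covariance of a (T, N) window of returns."""
    return np.cov(window, rowvar=False)

def gmv_weights(cov):
    """Unconstrained minimum variance weights (pseudoinverse handles singular input)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.pinv(cov) @ ones
    return x / x.sum()

def predicted_and_realized_risk(returns, t0, horizon, estimator):
    """Annualized predicted risk, Eq. (9), and realized risk, Eq. (10), at time t0."""
    past = returns[t0 - horizon:t0]        # T days preceding t0
    future = returns[t0:t0 + horizon]      # T days following t0
    cov_est = estimator(past)              # estimated covariance matrix
    cov_post = sample_covariance(future)   # ex post covariance matrix
    w = gmv_weights(cov_est)
    scale = np.sqrt(TRADING_DAYS_PER_YEAR)
    predicted = scale * np.sqrt(w @ cov_est @ w)
    realized = scale * np.sqrt(w @ cov_post @ w)
    return predicted, realized
```

Any function mapping a window of returns to a covariance matrix can be passed as `estimator`, so the same loop over selection times $t_0$ can be reused for each of the compared methods.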

B. Performance estimators 

To evaluate the performance of different covariance estimators we compare portfolio
realized risk, portfolio reliability (i.e. the agreement between realized and predicted risk), and
effective portfolio diversification of the portfolios $\mathbf{w}^{(f)}$. From now on we will
drop the superscripts (f). Clearly a portfolio is less risky than another when its realized risk
is smaller. Therefore our first performance metric is the realized risk. Moreover it is
important that the portfolio is reliable, i.e., the ex-ante prediction is close to the ex-post
observation of the portfolio risk. We consider both an absolute measure, $|\hat{s}_p - s_p|$, and
a relative measure, $|\hat{s}_p - s_p| / \hat{s}_p$, of reliability, where $\hat{s}_p$ is the
realized risk and $s_p$ the predicted risk. Note that in the relative measure we normalize with
respect to the realized risk instead of the predicted risk because the predicted risk can be
very small or even zero when T < N. A third aspect to evaluate the performance of a portfolio is
a high level of diversification across stocks of the portfolio. Thus we measure the effective
portfolio diversification of the different covariance estimator methods. Following [32], the
effective number $N_{eff}$ of stocks with a significant amount of money invested in is defined as

$$N_{eff} = \frac{1}{\sum_{i=1}^{N} w_i^2}. \qquad (11)$$

This quantity is 1 when all the wealth is invested in one stock, whereas it is N when the
wealth is equally divided among the stocks, i.e., $w_i = 1/N$. When all weights are positive,
i.e. when short selling is not allowed, the quantity $N_{eff}$ has a clear meaning. On the other
hand, when short selling is allowed there might be some ambiguity in the interpretation
of $N_{eff}$ [34]. For this reason, we introduce another measure of portfolio diversification.
Specifically we consider the absolute value of the weights and we compute the smallest
number of stocks for which the sum of absolute weights is larger than a given percentage q
of the sum of the absolute value of all the weights. In other words we define

$$N_q = \arg\min_{l} \left\{ \sum_{i=1}^{l} |w_i| \geq q \sum_{i=1}^{N} |w_i| \right\}. \qquad (12)$$

In the following we consider q = 0.9 and we term this indicator $N_{90}$. $N_{90}$ is the minimum
number of stocks in the portfolio such that their absolute weights cumulate to 90% of the
total of asset absolute weights.
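
Both diversification measures are straightforward to compute from a weight vector. The sketch below is illustrative: the function names are hypothetical, and for $N_q$ the stocks are taken in decreasing order of absolute weight, which is our reading of "smallest number of stocks" in the definition above.

```python
import numpy as np

def effective_number(weights):
    """Effective number of stocks, Eq. (11): 1 / sum_i w_i^2."""
    w = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(w ** 2)

def n_q(weights, q=0.9):
    """Smallest number of stocks whose absolute weights reach a fraction q of the total."""
    a = np.sort(np.abs(np.asarray(weights, dtype=float)))[::-1]   # largest first
    cumulative = np.cumsum(a)
    # First position where the running sum is at least q times the total.
    first = np.searchsorted(cumulative, q * cumulative[-1], side="left")
    return int(first) + 1

long_only = [0.5, 0.3, 0.15, 0.05]
with_short = [1.5, -0.5]
```

For `long_only` the two largest weights sum to 0.8 and the three largest to 0.95, so `n_q(long_only)` is 3. For `with_short` the effective number is 1 / 2.5 = 0.4, below one, which illustrates the ambiguity of $N_{eff}$ when short selling is allowed.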



C. Realized risk and reliability of different covariance estimators 

In this Section we present the results obtained in repeated portfolio optimization per- 
formed by using the covariance estimators described in Section III. Let us first discuss the 
general qualitative behavior of the realized risk for different estimators, different time hori- 
zons T (and thus different ratios T/N) and different short selling conditions. Later we 
perform more rigorous statistical tests. 

Figure [l] shows the mean value of the realized risk (averaged over different portfolio selec- 
tion times to) as a function of the time horizon T in the case of short selling (top panel) and 
no short selling (bottom panel). When short selling is allowed (top panel), the performance 
of the Markowitz portfolio is very poor and clearly different from that of the portfolios 
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FIG. 1: Mean realized (annualized) risk Sp for portfolios obtained with the 10 different methods as a 
function of the horizon T. T=l, 2,3,6,9,12, 24 months correspond to T/N ^ 0.2, 0.4, 0.7, 2.1, 2.8, 5.6, 
respectively. The top panel considers portfolios where short selling is allowed and the bottom panel 
considers portfolios where short selling is forbidden. 

obtained with the other investigated covariance estimators. The Markowitz direct optimization
procedure gives the highest realized risk at each time window T, with the exception of T = 2
years. Furthermore, while the realized risk curves of the other optimization procedures are
approximately increasing functions of T (except shrinkage to common covariance), the realized
risk of the Markowitz portfolio is non-monotonic: it is very high at T = 3 and 6 months and
decreases away from those values. The non-monotonic behavior of the Markowitz direct
optimization method can be explained as follows. When short selling is allowed, a high realized
risk at T ≈ 4.5 months is expected because T ≈ N (i.e., T ≈ 90 days = 4.5 months in our case)
is the crossing point from non-singular to singular covariance matrices. In fact, in Refs.
[22]-[25] a divergence of the realized risk is shown to occur in the limit T → ∞, N → ∞ and
T/N → 1 from the right. Here we verify this behavior and we also observe the divergence when
T/N → 1 from the left. From the top panel of Fig. 1 we can also see that spectral and
hierarchical clustering methods show a similar performance in terms of realized risk. Shrinkage
methods have a performance similar to that of the other algorithms, but the shrinkage to common
covariance method shows a relatively poorer performance for low values of T, while it shows one
of the best performances for high values of T.
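
The crossing from non-singular to singular covariance matrices around T ≈ N can be illustrated with a small numerical sketch. It is not the procedure used in the study: the returns are synthetic i.i.d. Gaussian draws, N = 6 is a toy value instead of N = 90, and the function names are hypothetical. With mean subtraction, the sample covariance of T observations has rank at most T - 1, so it cannot be inverted once T ≤ N.

```python
import random


def sample_covariance(returns):
    """Unbiased sample covariance of a T x N list of return rows."""
    t, n = len(returns), len(returns[0])
    means = [sum(row[j] for row in returns) / t for j in range(n)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in returns) / (t - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]


def matrix_rank(matrix, tol=1e-10):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def covariance_rank(t_days, n_stocks, seed=0):
    """Rank of the sample covariance of synthetic i.i.d. Gaussian returns (assumption)."""
    rng = random.Random(seed)
    returns = [[rng.gauss(0.0, 1.0) for _ in range(n_stocks)] for _ in range(t_days)]
    return matrix_rank(sample_covariance(returns))


if __name__ == "__main__":
    n_stocks = 6  # toy value; the study uses N = 90
    for t_days in (4, 6, 7, 12):
        rank = covariance_rank(t_days, n_stocks)
        status = "singular" if rank < n_stocks else "invertible"
        print(f"T = {t_days:2d}, N = {n_stocks}: rank = {rank} ({status})")
```

In the singular regime a direct inversion is not possible, which is why a pseudoinverse is needed for the Markowitz portfolio when T < N.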

The bottom panel of Fig. 1 shows the mean realized risk as a function of the time horizon
T when the no short selling condition is imposed. In this case too, the realized risk of all
portfolios approximately increases with T, except again for the Markowitz optimization and
the shrinkage to common covariance method. Moreover, for T larger than N all the methods
are roughly equivalent in terms of realized risk. For T < N, Markowitz and shrinkage to
common covariance clearly have a high realized risk, while the other methods are again
essentially equivalent (with the possible exception of the Hausdorff estimator for T = 3 months).
Finally, except for the Markowitz portfolio, a comparison of the top and bottom panels of
Fig. 1 shows that the realized risk of all portfolios turns out to be approximately the same
whether or not constraints on short selling are applied.

In the previous analysis we have considered the average realized risk over repeated
optimizations for different time horizons T. Now we fix T and consider the realized risk time
series to explore the role and nature of its fluctuations in different market conditions. We
compare these time series for different values of the time horizon T. In Fig. 2 we show the
time series of the realized risk as a function of the optimization time for the Markowitz
direct optimization and for two representative covariance estimation methods (the shrinkage
to common covariance and the RMT-M) when T = 1, 3, 6, and 12 months and short selling
is allowed. From the figure it is evident that, for a given method, the temporal fluctuations
in the time series of the realized risk are typically larger than the typical differences between
the realized risk of the different methods. The same is true if we compare other estimators
and also when short selling is not allowed. The observed high fluctuations in the realized
risk indicate that, for a detailed comparison of different portfolio performances, a comparison
of the relative differences between portfolio realized risks is more appropriate than a
comparison of the average realized risk (averaged over different portfolio selection times).



FIG. 2: Time series of the realized risk $s_p$ over the 11 years of the Markowitz, the RMT-M, and the
shrinkage to common covariance portfolios for a portfolio horizon T equal to 1 (top left panel), 3
(top right panel), 6 (bottom left panel), and 12 (bottom right panel) months. In these optimizations
short selling is allowed.

For example, let us consider the yearly case (bottom right panel of Fig. 2). The realized risks
of the Markowitz (black circles) and shrinkage to common covariance (red circles) portfolios,
averaged over the 11 year time period, are 13.6% ± 1.3% and 12.1% ± 1.1%, respectively,
where errors are standard errors. From these numbers one would conclude that the two
methods are equivalent in terms of realized risk. On the contrary, from the time series in
the bottom right panel of Fig. 2 one concludes that the realized risk of the shrinkage to
common covariance portfolio is systematically smaller than that of the Markowitz portfolio.
In fact, our results show that, for a yearly investment horizon when short selling is allowed,
the shrinkage to common covariance method outperforms all of the other methods.

For these reasons we measure portfolio performances relative to the Markowitz portfolio
by means of the quantity $1 - s_p/s_p^{(M)}$, where $s_p$ is the realized risk of the investigated
portfolio and $s_p^{(M)}$ is the realized risk of the Markowitz portfolio in the same period and
conditions. This quantity measures by how much the investigated portfolio outperforms the
Markowitz portfolio (in percentage) in terms of realized risk. To assess the statistical
robustness of the difference observed between a result obtained with a given covariance
estimator and the Markowitz one, we perform a t-test to evaluate whether the difference
$s_p^{(M)} - s_p$ has mean value equal to zero. Similarly, in order to test whether a given
portfolio is more reliable than the Markowitz one, we perform a t-test to evaluate whether the
difference $|\hat{s}_p^{(M)} - s_p^{(M)}| - |\hat{s}_p - s_p|$ is different from zero. Here
$\hat{s}_p$ and $\hat{s}_p^{(M)}$ are the predicted risks of the investigated and the Markowitz
portfolio, respectively.
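
As a minimal sketch of these two measurements, the snippet below computes the per-period relative risk reduction $1 - s_p/s_p^{(M)}$ and the t statistic of the paired differences $s_p^{(M)} - s_p$. The realized-risk numbers are invented for illustration and are not taken from the Tables; the function names are hypothetical, and only the t statistic (not its p-value) is computed, using the Python standard library.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Sample mean and standard error of the mean."""
    mean = statistics.fmean(values)
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    return mean, stderr


def relative_risk_reduction(realized, realized_markowitz):
    """Per-period 1 - s_p / s_p^(M), expressed in percent."""
    return [100.0 * (1.0 - s / s_m) for s, s_m in zip(realized, realized_markowitz)]


def paired_t_statistic(a, b):
    """t statistic for H0: mean(a - b) = 0, with len(a) - 1 degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    mean, stderr = mean_and_standard_error(diffs)
    return mean / stderr  # undefined if all differences are identical


if __name__ == "__main__":
    # Hypothetical annualized realized risks (%) over five selection times.
    s_markowitz = [14.0, 12.5, 16.0, 11.0, 13.5]
    s_estimator = [12.5, 11.5, 14.0, 10.5, 12.0]

    reduction = relative_risk_reduction(s_estimator, s_markowitz)
    mean, stderr = mean_and_standard_error(reduction)
    t_stat = paired_t_statistic(s_markowitz, s_estimator)
    print(f"mean relative reduction: {mean:.1f}% +/- {stderr:.1f}%")
    print(f"paired t statistic: {t_stat:.2f}")
```

Because the test is applied to the paired differences, a method can be significantly better than the Markowitz portfolio even when the two average realized risks overlap within their standard errors, as in the yearly example above.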

A quantitative comparison of all the covariance estimator methods is provided in Tables
I, II and III for the cases T = 1 year, 6 months, and 1 month, respectively, both when short
selling is allowed and when it is not. Since N = 90, in the first two cases T > N, while in
the third case T < N.

Let us first discuss the case in which short selling is allowed. Comparing the mean values
of $1 - s_p/s_p^{(M)}$ (third column in the Tables) and the results of the t-tests, we conclude that
relative portfolio performances depend on the investment horizon T. For a yearly horizon, all
methods except SI and UPGMA outperform the Markowitz portfolio, and the best method is
shrinkage to common covariance (as already noted above), which has a realized risk 11%
smaller on average than the Markowitz portfolio. Note that when T is equal to one year
RMT-M performs similarly well: the average realized risk for this method is 10.4% smaller
than the Markowitz one. However, for shorter time horizons a different pattern emerges.
When T = 6 months (Table II), all portfolios perform equally well compared to the Markowitz
portfolio, being roughly 31% to 33% less risky than the Markowitz portfolio.

When T = 1 month (see Table III), all methods except shrinkage to common covariance
outperform Markowitz direct optimization. The spectral methods SI, RMT-0, and RMT-M
perform the best and equally well. Among shrinkage methods, shrinkage to SI and shrinkage
to constant correlation perform almost as well as the spectral methods, while the shrinkage
to common covariance portfolio is the worst, having a realized risk which is statistically
indistinguishable from that of the Markowitz portfolio. By considering the reliability, which is
given in the last column of the Tables, we conclude that all the methods outperform Markowitz,
with a single exception observed for the SI covariance estimator when T = 1 year. Again
the degree of improvement is enhanced when T = 6 months.
We now consider the no short selling case. As anticipated in the previous discussion, for
T > N all portfolios have similar realized risks and the observed values are quite close to
those observed in the absence of the no short selling constraint. This is confirmed by the
results shown in the bottom part of Tables I and II. For T = 1 year the quantity
$1 - s_p/s_p^{(M)}$ is essentially consistent with zero for all portfolios. When T = 6 months only
the shrinkage to single index estimator performs slightly better than Markowitz direct
optimization at the 5% significance level. For T = 1 month (Table III) a different result
emerges. In fact, all portfolios have a significantly smaller realized risk than the Markowitz
portfolio. The only notable exception is the shrinkage to common covariance portfolio, which
presents the same (bad) performance as the Markowitz portfolio. The best results for the
realized risk are observed for hierarchical clustering methods and for the shrinkage to constant
correlation method. Moreover, the spectral methods perform slightly worse than the others with
respect to risk forecasting.

Note that when T/N ≈ 1 the bad performance of the Markowitz portfolio, observed when
short selling constraints are not imposed, is no longer present. The no short selling
constraint makes the Markowitz optimization procedure essentially equivalent to an
optimization procedure performed with more robust covariance estimators. Again this
observation is in agreement with the conclusion that imposing a no short selling constraint on
the portfolio optimization procedure is in some sense equivalent to reducing estimation errors
in the input to the optimization problem [6].



D. Portfolio diversification 



One further aspect to investigate concerns the degree of diversification of portfolios. As
for the realized risk, for the Markowitz direct optimization and for any given covariance
estimator, we observe large fluctuations of the participation ratio as the portfolio estimation
time $t_0$ varies. We therefore consider both the mean and the standard error of $N_{eff}$ for each
method across time and the mean value of $N_{eff}/N_{eff}^{(M)} - 1$ in percentage, where
$N_{eff}^{(M)}$ is the participation ratio for the Markowitz portfolio. This variable is a relative
measure that quantifies the portfolio diversification with respect to the diversification of the
benchmark Markowitz portfolio. Also in this case we perform a t-test in order to evaluate whether
the observed difference $N_{eff}^{(M)} - N_{eff}$ is compatible with a null hypothesis assuming that its mean

TABLE I: Different portfolio performance measures that combine the (annualized) predicted risk
$\hat{s}_p$ and realized risk $s_p$. 10 different methods are compared for a horizon of T = 1 year.
The numbers are averages over the different portfolios and the errors are standard errors. For
$s_p$ and $|\hat{s}_p - s_p|$ we report the result of a t-test evaluating whether the difference of
each quantity with the corresponding quantity for the Markowitz portfolio has mean value equal to
zero. The p-value of the null hypothesis is below a 1% threshold when the symbol ** is present,
while it is below 5% when the symbol * is present.

Year - s.s. (short selling allowed)
Method          $\hat{s}_p$     $s_p$             $1 - s_p/s_p^{(M)}$   $|\hat{s}_p - s_p|$
Markowitz       6.97 ± 0.63     13.6 ± 1.3        0 ± 0                 6.7 ± 1.1
SI              5.94 ± 0.41     13.2 ± 1.3 -      2.7 ± 5.0             7.2 ± 1.2 -
RMT-0           7.18 ± 0.67     12.4 ± 1.2**      9.5 ± 2.5             5.2 ± 1.1**
RMT-M           7.24 ± 0.68     12.2 ± 1.2**      10.4 ± 2.4            5.1 ± 1.0**
UPGMA           8.23 ± 0.88     13.0 ± 1.3 -      5.0 ± 2.3             4.8 ± 1.1**
WPGMA           7.88 ± 0.82     12.6 ± 1.3*       7.6 ± 2.6             4.8 ± 1.1**
Hausdorff       7.57 ± 0.80     12.3 ± 1.2*       9.3 ± 3.0             4.75 ± 0.99**
Shr. to SI      7.59 ± 0.70     12.3 ± 1.1**      9.09 ± 0.90           4.76 ± 0.98**
Shr. C. Cov.    10.54 ± 0.91    12.1 ± 1.1**      11.0 ± 1.7            2.57 ± 0.69**
Shr. C. Corr.   8.33 ± 0.81     12.8 ± 1.2**      6.3 ± 1.0             4.5 ± 1.0**

Year - no s.s. (short selling forbidden)
Method          $\hat{s}_p$     $s_p$             $1 - s_p/s_p^{(M)}$   $|\hat{s}_p - s_p|$
Markowitz       9.46 ± 0.88     12.7 ± 1.2        0 ± 0                 4.06 ± 0.93
SI              7.90 ± 0.64     12.9 ± 1.2 -      -2.2 ± 3.0            5.5 ± 1.2 -
RMT-0           9.18 ± 0.84     12.8 ± 1.2 -      -0.34 ± 0.97          4.33 ± 0.98 -
RMT-M           9.08 ± 0.83     12.8 ± 1.2 -      0.07 ± 0.95           4.33 ± 0.98 -
UPGMA           9.9 ± 1.0       12.9 ± 1.3 -      -0.70 ± 0.98          3.93 ± 0.97 -
WPGMA           9.01 ± 0.89     12.7 ± 1.2 -      0.2 ± 1.5             4.11 ± 0.98 -
Hausdorff       8.68 ± 0.91     12.5 ± 1.1 -      1.7 ± 2.1             4.14 ± 0.95 -
Shr. to SI      9.35 ± 0.85     12.6 ± 1.1 -      0.75 ± 0.42           4.01 ± 0.93 -
Shr. C. Cov.    11.7 ± 1.0      12.2 ± 1.1 -      3.4 ± 1.9             2.40 ± 0.72 -
Shr. C. Corr.   10.05 ± 0.98    12.8 ± 1.2 -      -0.43 ± 0.90          3.92 ± 0.92 -

TABLE II: Different portfolio performance measures that combine the (annualized) predicted risk
$\hat{s}_p$ and realized risk $s_p$. 10 different methods are compared for a horizon of T = 6 months.
The numbers are averages over the different portfolios and the errors are standard errors. For
$s_p$ and $|\hat{s}_p - s_p|$ we report the result of a t-test evaluating whether the difference of
each quantity with the corresponding quantity for the Markowitz portfolio has mean value equal to
zero. The p-value of the null hypothesis is below a 1% threshold when the symbol ** is present,
while it is below 5% when the symbol * is present.

6 months - s.s. (short selling allowed)
Method          $\hat{s}_p$     $s_p$             $1 - s_p/s_p^{(M)}$   $|\hat{s}_p - s_p|$
Markowitz       4.23 ± 0.30     18.1 ± 1.5        0 ± 0                 13.9 ± 1.4
SI              5.52 ± 0.33     12.05 ± 0.92**    31.3 ± 3.0            6.53 ± 0.83**
RMT-0           6.10 ± 0.42     11.91 ± 0.96**    32.4 ± 3.3            5.81 ± 0.82**
RMT-M           6.17 ± 0.43     11.80 ± 0.95**    33.0 ± 3.2            5.63 ± 0.82**
UPGMA           7.46 ± 0.57     12.12 ± 0.91**    31.1 ± 3.1            4.66 ± 0.76**
WPGMA           7.22 ± 0.56     11.86 ± 0.86**    32.3 ± 3.1            4.65 ± 0.74**
Hausdorff       6.48 ± 0.55     11.82 ± 0.82**    32.4 ± 3.0            5.34 ± 0.77**
Shr. to SI      6.41 ± 0.43     11.72 ± 0.82**    33.4 ± 2.4            5.30 ± 0.65**
Shr. C. Cov.    10.77 ± 0.76    11.73 ± 0.80**    33.2 ± 2.4            2.82 ± 0.55**
Shr. C. Corr.   7.51 ± 0.53     12.05 ± 0.88**    31.7 ± 2.7            4.54 ± 0.67**

6 months - no s.s. (short selling forbidden)
Method          $\hat{s}_p$     $s_p$             $1 - s_p/s_p^{(M)}$   $|\hat{s}_p - s_p|$
Markowitz       8.57 ± 0.63     11.85 ± 0.87      0 ± 0                 3.94 ± 0.69
SI              7.40 ± 0.52     11.98 ± 0.86 -    -1.7 ± 1.5            4.92 ± 0.78*
RMT-0           8.27 ± 0.62     11.83 ± 0.86 -    -0.1 ± 1.0            4.17 ± 0.72 -
RMT-M           8.20 ± 0.61     11.81 ± 0.86 -    0.1 ± 1.0             4.21 ± 0.72 -
UPGMA           9.19 ± 0.72     11.83 ± 0.89 -    0.26 ± 0.96           3.57 ± 0.72 -
WPGMA           8.42 ± 0.67     11.79 ± 0.87 -    0.4 ± 1.0             3.75 ± 0.78 -
Hausdorff       7.45 ± 0.67     12.04 ± 0.82 -    -2.5 ± 1.5            4.88 ± 0.83*
Shr. to SI      8.48 ± 0.61     11.69 ± 0.87*     1.31 ± 0.51           3.87 ± 0.71 -
Shr. C. Cov.    11.79 ± 0.84    11.84 ± 0.85 -    -0.6 ± 2.2            3.30 ± 0.63 -
Shr. C. Corr.   9.48 ± 0.71     11.86 ± 0.93 -    0.5 ± 1.1             3.42 ± 0.73*

TABLE III: Different portfolio performance measures that combine the (annualized) predicted risk
$\hat{s}_p$ and realized risk $s_p$. 10 different methods are compared for a horizon of T = 1 month.
The numbers are averages over the different portfolios and the errors are standard errors. For
$s_p$ and $|\hat{s}_p - s_p|$ we report the result of a t-test evaluating whether the difference of
each quantity with the corresponding quantity for the Markowitz portfolio has mean value equal to
zero. The p-value of the null hypothesis is below a 1% threshold when the symbol ** is present,
while it is below 5% when the symbol * is present.

Month - s.s. (short selling allowed)
Method          $\hat{s}_p$     $s_p$             $1 - s_p/s_p^{(M)}$   $|\hat{s}_p - s_p|$
Markowitz       0 ± 0           12.59 ± 0.41      0 ± 0                 12.59 ± 0.41
SI              4.15 ± 0.12     11.00 ± 0.42**    12.1 ± 1.5            6.85 ± 0.37**
RMT-0           3.84 ± 0.11     10.94 ± 0.39**    12.5 ± 1.4            7.10 ± 0.34**
RMT-M           3.90 ± 0.12     10.91 ± 0.39**    12.8 ± 1.4            7.01 ± 0.34**
UPGMA           5.01 ± 0.17     11.66 ± 0.45**    6.6 ± 2.1             6.65 ± 0.38**
WPGMA           4.74 ± 0.17     11.44 ± 0.44**    8.3 ± 1.9             6.70 ± 0.37**
Hausdorff       4.98 ± 0.17     11.62 ± 0.45**    7.0 ± 2.1             6.64 ± 0.37**
Shr. to SI      3.48 ± 0.15     11.04 ± 0.39**    11.8 ± 1.2            7.57 ± 0.35**
Shr. C. Cov.    13.1 ± 0.47     12.44 ± 0.42 -    0.5 ± 1.5             3.64 ± 0.30**
Shr. C. Corr.   5.87 ± 0.20     11.56 ± 0.45**    7.4 ± 1.9             5.70 ± 0.37**

Month - no s.s. (short selling forbidden)
Method          $\hat{s}_p$     $s_p$             $1 - s_p/s_p^{(M)}$   $|\hat{s}_p - s_p|$
Markowitz       4.38 ± 0.24     13.09 ± 0.52      0 ± 0                 8.73 ± 0.53
SI              5.60 ± 0.20     11.60 ± 0.44**    9.3 ± 1.4             6.04 ± 0.39**
RMT-0           5.48 ± 0.21     11.57 ± 0.42**    9.5 ± 1.2             6.11 ± 0.38**
RMT-M           5.49 ± 0.21     11.54 ± 0.42**    9.7 ± 1.2             6.07 ± 0.38**
UPGMA           7.11 ± 0.25     11.45 ± 0.44**    10.8 ± 1.3            4.54 ± 0.37**
WPGMA           6.15 ± 0.22     11.48 ± 0.44**    10.6 ± 1.2            5.39 ± 0.38**
Hausdorff       6.73 ± 0.23     11.53 ± 0.43**    10.3 ± 1.2            4.87 ± 0.34**
Shr. to SI      5.72 ± 0.21     11.76 ± 0.43**    8.64 ± 0.91           6.06 ± 0.38**
Shr. C. Cov.    13.39 ± 0.48    12.74 ± 0.44 -    -2.6 ± 2.6            3.76 ± 0.30**
Shr. C. Corr.   8.20 ± 0.29     11.56 ± 0.47**    10.3 ± 1.4            3.93 ± 0.35**

TABLE IV: Absolute and relative participation ratio measure $N_{eff}$ of the portfolios obtained with
the 10 covariance estimators for different horizons of T = 1, 6 and 12 months. Short selling is not
allowed. The numbers are averages over the different portfolios and the errors are standard errors.
For $N_{eff}$ we report the result of a t-test evaluating whether the difference with the corresponding
quantity for the Markowitz portfolio has mean value equal to zero. The p-value of the null hypothesis
is below a 1% threshold when the symbol ** is present, while it is below 5% when the symbol * is
present. For each horizon, "absolute" is $N_{eff}$ and "relative" is $N_{eff}/N_{eff}^{(M)} - 1$ in percentage.

                One month                         Six months                        One year
Method          absolute          relative        absolute          relative        absolute          relative
Markowitz       6.80 ± 0.22       0.0 ± 0.0       9.8 ± 1.0         0.0 ± 0.0       9.9 ± 1.5         0.0 ± 0.0
SI              14.91 ± 0.98**    104.0 ± 8.4     14.0 ± 2.1**      36.8 ± 7.5      13.8 ± 2.7        33.4 ± 9.2*
RMT-0           13.45 ± 0.80**    85.4 ± 6.2      11.2 ± 1.3**      13.4 ± 2.7      10.6 ± 1.7        6.8 ± 4.0 -
RMT-M           13.63 ± 0.81**    87.9 ± 6.2      11.6 ± 1.3**      16.9 ± 2.9      10.9 ± 1.7        10.1 ± 4.0 -
UPGMA           8.90 ± 0.44**     26.5 ± 3.5      10.2 ± 1.1**      5.1 ± 3.7       10.7 ± 1.8        6.7 ± 4.6 -
WPGMA           11.62 ± 0.53**    67.6 ± 4.3      12.1 ± 1.1**      26.3 ± 5.2      13.0 ± 1.9        30.5 ± 3.6**
Hausdorff       9.55 ± 0.34**     42.4 ± 3.3      13.1 ± 1.4**      36.0 ± 5.5      13.0 ± 1.8        34.9 ± 4.6**
Shr. to SI      11.7 ± 0.67**     60.9 ± 5.1      11.3 ± 1.4**      11.8 ± 2.2      10.7 ± 1.8        7.3 ± 1.8**
Shr. C. Cov.    37.3 ± 1.4**      530 ± 45        18.9 ± 1.5**      159 ± 64        15.5 ± 1.8        100 ± 51**
Shr. C. Corr.   7.64 ± 0.43**     7.5 ± 3.8       10.1 ± 1.2 -      -0.1 ± 2.6      10.0 ± 1.7        -1.3 ± 2.8 -



value is zero. 



In Table IV we report the average and standard error of $N_{eff}$ and $N_{eff}/N_{eff}^{(M)} - 1$ for
the 10 optimization methods and for T = 1 month, 6 months, and 1 year, together with
the related results of the t-test. The Table shows a different behavior at different values
of the investment time window T. Specifically, at T = 1 month all methods present a
participation ratio which is higher than the one observed for Markowitz direct optimization.
When T = 6 months all methods still outperform Markowitz, with the exception of the
shrinkage to constant correlation. When T = 1 year there are still several methods that
outperform Markowitz, namely SI, WPGMA, Hausdorff, shrinkage to single index and
shrinkage to common covariance. The method with the highest participation ratio at any

TABLE V: Absolute and relative participation ratio measure $N_{90}$ of the portfolios obtained with
the 10 covariance estimators for different horizons of T = 1, 6 and 12 months. Results are reported
both when short selling is allowed and when it is not. The numbers are averages over the different
portfolios and the errors are standard errors. For $N_{90}$ we report the result of a t-test
evaluating whether the difference with the corresponding quantity for the Markowitz portfolio has
mean value equal to zero. The p-value of the null hypothesis is below a 1% threshold when the
symbol ** is present, while it is below 5% when the symbol * is present. For each horizon,
"absolute" is $N_{90}$ and "relative" is $N_{90}/N_{90}^{(M)} - 1$ in percentage.

Short selling
                One month                         Six months                        One year
Method          absolute          relative        absolute          relative        absolute          relative
Markowitz       59.41 ± 0.18      0.0 ± 0.0       56.81 ± 0.52      0.0 ± 0.0       55.3 ± 0.99       0.0 ± 0.0
SI              52.85 ± 0.31**    -10.95 ± 0.59   55.48 ± 0.71 -    -2.2 ± 1.4      55.1 ± 1.2 -      -0.3 ± 1.6
RMT-0           53.87 ± 0.29**    -9.23 ± 0.54    55.57 ± 0.67 -    -2.1 ± 1.2      55.1 ± 0.95 -     -0.2 ± 1.6
RMT-M           53.85 ± 0.29**    -9.26 ± 0.54    55.38 ± 0.68*     -2.4 ± 1.2      55.1 ± 0.97 -     -0.2 ± 1.6
UPGMA           52.27 ± 0.29**    -11.91 ± 0.55   54.57 ± 0.49**    -3.8 ± 1.1      55.6 ± 0.97 -     0.7 ± 1.9
WPGMA           51.64 ± 0.28**    -12.96 ± 0.56   54.14 ± 0.67**    -4.6 ± 1.3      54.9 ± 1.0 -      -0.6 ± 2.0
Hausdorff       52.03 ± 0.26**    -12.31 ± 0.52   52.48 ± 0.70**    -7.6 ± 1.2      53.7 ± 1.1 -      -2.7 ± 2.1
Shr. to SI      53.45 ± 0.29**    -9.97 ± 0.50    54.38 ± 0.63**    -4.2 ± 1.0      55.0 ± 1.1 -      -0.5 ± 1.5
Shr. C. Cov.    60.89 ± 0.35**    2.57 ± 0.61     57.81 ± 0.49 -    1.9 ± 1.3       57.2 ± 1.0 -      3.6 ± 2.1
Shr. C. Corr.   52.97 ± 0.31**    -10.71 ± 0.62   53.95 ± 0.64**    -5.0 ± 1.1      54.6 ± 1.0 -      -1.24 ± 0.94

No short selling
                One month                         Six months                        One year
Method          absolute          relative        absolute          relative        absolute          relative
Markowitz       8.40 ± 0.19       0.0 ± 0.0       12.81 ± 1.00      0.0 ± 0.0       13.4 ± 1.5        0.0 ± 0.0
SI              18.9 ± 1.1**      113.3 ± 8.2     17.0 ± 2.0**      31.1 ± 7.6      16.4 ± 2.6 -      18.7 ± 8.0
RMT-0           17.21 ± 0.85**    95.9 ± 6.1      13.8 ± 1.2*       8.9 ± 4.0       13.4 ± 1.7 -      -0.7 ± 3.7
RMT-M           17.40 ± 0.85**    98.3 ± 6.0      14.3 ± 1.2**      12.5 ± 3.9      13.9 ± 1.7 -      3.1 ± 3.3
UPGMA           11.55 ± 0.48**    33.3 ± 3.4      12.9 ± 1.1 -      -0.8 ± 3.6      13.2 ± 1.9 -      -4.5 ± 5.0
WPGMA           15.39 ± 0.59**    79.8 ± 4.4      15.6 ± 1.2**      23.5 ± 5.7      16.1 ± 1.7**      20.5 ± 3.4
Hausdorff       12.61 ± 0.34**    51.5 ± 2.9      17.4 ± 1.4**      37.4 ± 4.9      16.4 ± 1.4**      25.7 ± 4.9
Shr. to SI      15.24 ± 0.74**    72.4 ± 5.2      14.6 ± 1.4**      12.5 ± 3.0      14.4 ± 1.9 -      5.7 ± 2.7
Shr. C. Cov.    37.4 ± 1.2**      363 ± 20        21.3 ± 1.3**      85 ± 22         18.8 ± 1.7**      46 ± 10
Shr. C. Corr.   10.00 ± 0.51**    14.3 ± 3.9      12.7 ± 1.3 -      -4.2 ± 4.0      13.5 ± 1.9 -      -1.8 ± 4.8

time horizon is the shrinkage to common covariance. For example, when T = 1 month it has
a participation ratio which is 530% higher than the Markowitz portfolio on average. This
high diversification is not shared with the other two shrinkage methods. This is probably
due to the fact that the target matrix of the shrinkage to common covariance assumes that
all the stocks are equivalent. Within the other classes of covariance estimators, SI among
the spectral methods and WPGMA among the hierarchical clustering methods have the highest
participation ratio.
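
To make the remark about the target matrix concrete, here is a minimal sketch of a shrinkage step toward a "common covariance" target. It assumes that the target replaces every variance by the average variance and every covariance by the average covariance, and it uses a fixed, purely illustrative shrinkage intensity instead of one estimated from the data. The function names and the toy matrix are hypothetical.

```python
def common_covariance_target(cov):
    """Target with one common variance and one common covariance (assumed form)."""
    n = len(cov)
    mean_var = sum(cov[i][i] for i in range(n)) / n
    mean_cov = sum(cov[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return [[mean_var if i == j else mean_cov for j in range(n)] for i in range(n)]


def shrink(cov, target, intensity):
    """Convex combination intensity * target + (1 - intensity) * cov."""
    n = len(cov)
    return [
        [intensity * target[i][j] + (1.0 - intensity) * cov[i][j] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy 3 x 3 sample covariance; the numbers are illustrative only.
    sample = [
        [4.0, 1.0, 0.0],
        [1.0, 9.0, 2.0],
        [0.0, 2.0, 16.0],
    ]
    target = common_covariance_target(sample)
    shrunk = shrink(sample, target, intensity=0.5)  # 0.5 is an arbitrary choice
    for row in shrunk:
        print(["%.3f" % x for x in row])
```

For such a target taken alone, every stock is interchangeable, so by symmetry the global minimum variance portfolio is equally weighted. The stronger the shrinkage toward it, the closer the weights move to equal weighting, which is consistent with the high participation ratio reported for this method.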

In the above discussion, we have used $N_{eff}$ to quantify the portfolio diversification under
the no short selling constraint. In fact, we have already discussed that this indicator is not
meaningful when short selling is allowed. For this reason, we now consider the second
participation ratio indicator, $N_{90}$, introduced above. Table V reports the mean and the
standard error of $N_{90}$ for each method averaged across investment time and, as before, a
relative measure, both when short selling is allowed and when it is forbidden. We also perform
a t-test to evaluate whether the difference $N_{90}^{(M)} - N_{90}$ has a mean value significantly
different from zero.
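
The two diversification indicators can be sketched as follows. The definitions used here are assumptions, since the formal ones are given earlier in the paper and not in this section: $N_{eff}$ is taken as the inverse of the sum of squared weights, and $N_{90}$ as the smallest number of stocks whose absolute weights add up to at least 90% of the total absolute weight. Function names and weight vectors are hypothetical.

```python
def participation_ratio(weights):
    """N_eff = 1 / sum(w_i^2), assuming the weights sum to one."""
    return 1.0 / sum(w * w for w in weights)


def n90(weights, fraction=0.9):
    """Smallest number of stocks holding `fraction` of the total absolute weight."""
    abs_weights = sorted((abs(w) for w in weights), reverse=True)
    total = sum(abs_weights)
    running = 0.0
    for count, w in enumerate(abs_weights, start=1):
        running += w
        # Small tolerance guards against floating-point round-off at the threshold.
        if running >= fraction * total - 1e-12:
            return count
    return len(abs_weights)


if __name__ == "__main__":
    equal = [0.25, 0.25, 0.25, 0.25]
    concentrated = [0.5, 0.45, 0.05]
    long_short = [0.8, -0.3, 0.5]  # sums to one, with one short position

    print(participation_ratio(equal), n90(equal))
    print(participation_ratio(concentrated), n90(concentrated))
    print(n90(long_short))
```

Under these assumed definitions, $N_{eff}$ equals the number of stocks for an equally weighted long-only portfolio, whereas with negative weights the sum of squares is no longer bounded in the same way. Counting absolute weights, as $N_{90}$ does, stays interpretable in both cases.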

When short selling is not allowed, $N_{90}$ gives results very close to those observed for $N_{eff}$.
In fact, when T = 1 month all the methods give a portfolio more diversified than Markowitz
direct optimization. When T = 6 months all the methods outperform Markowitz with the
exception of shrinkage to constant correlation and UPGMA, whereas when T = 1 year only
WPGMA, Hausdorff and shrinkage to common covariance still outperform Markowitz. When
short selling is allowed, Markowitz direct optimization provides portfolios characterized by
an $N_{90}$ value slightly higher than or statistically compatible with the value observed for the
other methods. The only exception is shrinkage to common covariance when T = 1 month, but
also in this case the difference observed, although statistically validated, is very small.

In summary, when short selling is allowed the weights have a similar structure indepen- 
dently of the method, and the wealth (positive or negative) is roughly concentrated in 55 
stocks. When short selling is not allowed, a large variety of behaviors is observed depending 
on the method and on the investment time horizon. In general, the shrinkage to common 
covariance method has the largest participation ratio. 

When short selling is allowed, it is also worth analysing the amount of short selling
required by the optimization procedures of the global minimum variance portfolio. To
quantify this aspect, in Fig. 3 we show, for each method, the average value of the ratio
$w_-/w_+$, where $w_-$ is the sum of the absolute values of all negative weights present in the
portfolio and $w_+$ is the sum of all positive weights. The ratio $w_-/w_+$ ranges from 0 (absence
of short selling) to about 1 (negative weights of the same size as positive weights).
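
A short sketch of this ratio is given below, with a hypothetical function name and made-up weight vectors. It assumes portfolio weights normalized to sum to one, so that $w_+ - w_- = 1$ and the ratio $w_-/w_+ = w_-/(1 + w_-)$ stays below 1.

```python
def short_selling_ratio(weights):
    """Ratio w_- / w_+ of total short exposure to total long exposure."""
    w_minus = sum(-w for w in weights if w < 0)
    w_plus = sum(w for w in weights if w > 0)
    return w_minus / w_plus


if __name__ == "__main__":
    long_only = [0.5, 0.3, 0.2]
    moderate = [0.8, 0.6, -0.4]        # w_- = 0.4, w_+ = 1.4
    heavy = [10.0, 6.0, -9.0, -6.0]    # w_- = 15, w_+ = 16

    for name, w in (("long only", long_only), ("moderate", moderate), ("heavy", heavy)):
        print(f"{name}: w_-/w_+ = {short_selling_ratio(w):.3f}")
```

The last vector shows why the upper end of the range is "about 1": the ratio approaches 1 only when the short positions are large compared with the invested wealth.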

Fig. 3 shows that Markowitz direct optimization requires the highest fraction of short
selling positions. This property is maximal when T/N ≈ 1. All the other methods present a
significantly lower mean value of $w_-/w_+$. The specific values depend on the specific covariance
estimation method and are slightly affected by the value of the investment horizon T. In fact,
a slight increase of $w_-/w_+$ is observed as T increases. The lowest value, $w_-/w_+ \approx 0.28$,
is observed for the SI model, whereas the highest value, $w_-/w_+ \approx 0.40$, is observed for the
shrinkage to constant correlation method. The region of worst performance of the Markowitz
direct optimization procedure is therefore associated with the maximal amount of portfolio
wealth allocated in stocks that need to be sold short.

These results provide empirical support for the conclusion that Markowitz direct
optimization in the presence of short selling suffers from an overexposure to short selling. This
overexposure is maximal when T/N ≈ 1 and is progressively mitigated both when T > N
and when T < N. On the contrary, reducing the estimation errors of the covariance matrix
implicitly limits the amount of short selling positions requested in the optimal
portfolio. According to the results obtained in Ref. [6] and to the empirical results obtained
in this study, we observe that the reverse is also true. In fact, imposing no short selling
conditions on the Markowitz optimization has an effect equivalent to reducing the estimation
errors in the covariance matrix for any value of T, and especially when T/N ≈ 1.

V. CONCLUSIONS 

The portfolio optimization problem is significantly affected by estimation errors of the
covariance matrix. For this reason many estimators alternative to the sample covariance
matrix have been proposed in the literature. In this respect, two important and related
questions are: (i) which aspects of the portfolio optimization can be improved with improved
covariance matrix estimators? (ii) when, i.e. under which conditions, are improved
covariance estimators really useful in enhancing the performance of the corresponding
optimal portfolios? We have investigated these questions by considering 9 different methods
for estimating the covariance matrix and we have quantitatively compared the relative
efficiency of the corresponding portfolios with respect to the benchmark Markowitz portfolio
on a series of repeated investment exercises over 11 years. The portfolio optimization has
been performed under different conditions: different estimation-investment horizons T, i.e.,
different values of T/N (N = 90), and the presence/absence of short selling constraints.
Although the realized risk and the degree of portfolio diversification of the portfolios
constructed with the different covariance estimators show large fluctuations, relative
performances of different methods turn out to be quite persistent over time. Under different
market conditions some persistent behaviors can be observed. For a specific choice of both
the length of the estimation-investment horizon and the presence/absence of constraints on
short selling, an estimator might be useful in improving a specific aspect of the optimization,
but under a different choice the same method might not lead to a significant improvement
in the same aspect.

FIG. 3: Mean value of the ratio $w_-/w_+$ between the sum of the absolute values of negative weights
and the sum of positive weights for the portfolios obtained with the 10 different methods as a
function of the horizon T.

Specifically, when T/N > 1 various covariance estimators lead to optimal portfolios with
similar realized risk and portfolio diversification. In this regime, Markowitz direct
optimization has an overall good performance both with and without short selling constraints.
When short selling is allowed, a portfolio less risky than the Markowitz one can be obtained
by using improved covariance estimators, whereas when short selling is forbidden the
investigated estimators are not able to decrease the risk of the portfolio with respect to the
Markowitz one. In this last case some covariance estimators lead to higher portfolio diversification.

On the other hand, when T/N is close to 1, portfolio performances are greatly influenced
by the addition of no short selling constraints. Specifically, when short selling is allowed,
we observe that the Markowitz direct optimization process has the worst performance. This
result is consistent with the theoretical observations given in Ref. [6] and with the
observation of the divergence of estimation errors of the covariance matrix associated with
this regime [22]-[25]. Under this condition all the investigated covariance estimators provide
portfolios with lower realized risk, higher reliability and smaller exposure to short selling.
Their performances are quite similar with respect to realized risk, reliability and portfolio
diversification, but differences are observed with respect to the degree of exposure to short
selling. When the no short selling constraint is applied, we observe a different scenario. All
covariance estimators lead to portfolios with realized risks and reliabilities that are
statistically consistent with those obtained by Markowitz direct optimization. However,
portfolios constructed with the investigated methods have a higher degree of diversification
than those observed for the Markowitz direct optimization. This result is consistent with the
theoretical and empirical conclusions reached in Ref. [6], where it was shown that adding short
selling constraints to the Markowitz portfolios can have the same effect as using a better
estimate of the covariance matrix (using the shrinkage estimator in their case). Our results
suggest that this conclusion also applies to other covariance estimators such as the methods
investigated in this paper.

When T/N is smaller than one, the worst performance with respect to realized risk is
obtained for Markowitz direct optimization and shrinkage to common covariance. This
result indicates that one should not use the sample covariance matrix in this regime (either
with or without short selling). Also the use of the pseudoinverse gives portfolios with very
poor performance. All the other methods lead to portfolios with better performances with
respect to realized risk and reliability in realized risk forecasts, both in the presence and
in the absence of short selling. When the no short selling constraint is imposed, portfolio
diversification is better achieved when filtered covariance estimators are used. This last
observation is also true for the shrinkage to common covariance estimator both when short
selling is allowed and when it is forbidden. Indeed this method presents the highest degree
of portfolio diversification. It is therefore worth noting that the observation that Markowitz
and shrinkage to common covariance portfolios are characterized by similar values of the
realized risk does not imply that they have a similar composition. In fact the portfolio
obtained with the shrinkage to common covariance method is systematically more diversified.
The conclusion reached in Ref. [6] and empirically observed by us when T/N ≈ 1 does not
seem to hold when T/N is less than one. In fact portfolios obtained with Markowitz direct
optimization are characterized by realized risks, reliability of risk forecasts and portfolio
diversification that are worse than those of most other methods based on covariance estimators,
also when short selling is forbidden.

In summary, the use of efficient covariance estimators improves different aspects of the
portfolio optimization process. The degree of improvement depends on the selected method,
the value of the parameter T/N, and the presence or absence of a no short selling constraint.
The improvements achieved refer to one or more of the following key portfolio indicators: (i)
realized risk, (ii) reliability of realized risk predictions, (iii) degree of portfolio diversification
and (iv) fraction of short selling when short selling is allowed.